Skip to content

Add support for a reducer step that aggregates per-user metrics#76

Merged
xkrogen merged 10 commits into
linkedin:masterfrom
csgregorian:cgregori-per-user-metrics
Feb 16, 2019
Merged

Add support for a reducer step that aggregates per-user metrics#76
xkrogen merged 10 commits into
linkedin:masterfrom
csgregorian:cgregori-per-user-metrics

Conversation

@csgregorian

@csgregorian csgregorian commented Feb 13, 2019

Copy link
Copy Markdown
Contributor

The goal of this PR is to let Dynamometer collect and emit per-user metrics, as opposed to the whole-workload measurements that it currently collects using Counters. This is done by first changing the AuditReplayMapper to emit key-value pairs in the form of (username_type, latency) where type is either READ or WRITE. Then, a reducer step AuditReplayReducer is added to sum latencies per key. More generally, this change allows Dynamometer to support arbitrary reducers for stats aggregation along with the existing mappers.

I modified TestWorkloadGenerator to run with and without a reducer, and confirm that the file was created. This still needs a test to ensure that the actual contents of the output file are correct. As well, it's possible that I changed some things that didn't need to be changed in the process of getting the tests to pass/output to appear.

Mostly trying to see if CI will pass, but ready for some preliminary reviews.

EDIT: a couple other things that I'm not sure are the right approach and need to be looked at a bit closer:

  • Not sure whether extending WorkloadMapper to specify KEYOUT and VALUEOUT as parameters is the right move
  • What happens if AuditReplayMapper writes to the context without a reducer being specified? Should job.setMapOutputKeyClass be set to LongWritable.class instead of NullWritable.class regardless of whether it's actually outputting anything? It doesn't seem like this caused any errors, but if not, why?
  • I simplified WorkloadReducer just to what it needs to run, but it should probably have getDescription and getConfigDescriptions as well.

@xkrogen xkrogen left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @csgregorian , this is a great start! Besides the inline comments, I have two general comments:

  • We should add a Combiner to reduce the data transfer volume between Mappers and Reducers.
  • I'm wondering if it makes sense for the WorkloadReducer to be user-configurable. I'm thinking it may be better for it to be specified by the mapper, like how a WorkloadMapper specifies what input format it expects. Since the output keys/values of the mapper must be exactly matched to the input keys/values of the reducer, I don't think it makes sense to configure them independently. Let me know what you think.

@csgregorian csgregorian force-pushed the cgregori-per-user-metrics branch from 757d342 to 67873d7 Compare February 14, 2019 23:38
@csgregorian csgregorian force-pushed the cgregori-per-user-metrics branch from 9b61ead to dc3e754 Compare February 15, 2019 18:15
@csgregorian csgregorian force-pushed the cgregori-per-user-metrics branch from dc3e754 to 75de589 Compare February 15, 2019 18:15
@csgregorian

csgregorian commented Feb 15, 2019

Copy link
Copy Markdown
Contributor Author

Simplified a ton of stuff here.

  • Reducers are local to a WorkloadMapper instead of being user configured
  • WorkloadMappers handle their own configuration
  • AuditReplayThreads no longer write directly to context, instead aggregating all of their command latencies locally. These are then collected during the AuditReplayMapper cleanup, similar to the drainCounters behaviour. This obviates the need for a combiner, as the maximum number of inputs to the Reducer is now unique keys * threads per mapper * mappers, instead of commands per thread * threads per mapper * mappers
  • Changed AuditReplayMapper/Reducer key type to a custom composite writable userCommandKey (not simplified, but the right way to do it [probably])
  • Extended TestWorkloadGenerator to test the output file contents

Ready for more reviews!

@xkrogen xkrogen left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I love all of the changes. I have mostly minor comments throughout, and my only general comment would be that we should update TestDynamometerInfra in some way. We're already asserting on the output within TestWorkloadGenerator, but maybe in the infra test we can at least assert that the -workload_output_path arg works as expected and puts the file in the right place?

@xkrogen xkrogen left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few minor style nits and then it's all good to go!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants